How to See with an Event Camera
Seeing enables us to recognise people and things, detect motion, perceive our 3D environment and more. Light stimulates our eyes, sending electrical impulses to the brain where we form an image and extract useful information. Computer vision aims to endow computers with the ability to interpret and understand visual information - an artificial analogue to human vision. Traditionally, images from a conventional camera are processed by algorithms designed to extract information. Event cameras are bio-inspired sensors that offer improvements over conventional cameras. They (i) are fast, (ii) can see dark and bright at the same time, (iii) have less motion-blur, (iv) use less energy and (v) transmit data efficiently. However, it is difficult for humans and computers alike to make sense of the raw output of event cameras, called events, because events look nothing like conventional images. This thesis presents novel techniques for extracting information from events via: (i) reconstructing images from events then processing the images using conventional computer vision and (ii) processing events directly to obtain desired information. To advance both fronts, a key goal is to develop a sophisticated understanding of event camera output, including its noise properties. Chapters 3 and 4 present fast algorithms that process each event upon arrival to continuously reconstruct the latest image and extract information. Chapters 5 and 6 apply machine learning to event cameras, letting the computer learn, from large amounts of data, how to process events to reconstruct video and estimate motion. I hope the algorithms presented in this thesis will take us one step closer to building intelligent systems that can see with event cameras.
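An event camera's raw output is just a stream of (x, y, timestamp, polarity) tuples. As a minimal illustration of the first route above (reconstructing images from events), the sketch below integrates each event's contrast step into a running log-intensity image; the event values, image size, and contrast step C are invented for illustration.

```python
import numpy as np

# Hypothetical event stream: (x, y, timestamp, polarity), polarity in {-1, +1}.
events = [(2, 3, 0.001, +1), (2, 3, 0.002, +1), (1, 0, 0.004, -1)]

H, W = 4, 4
C = 0.2                          # assumed per-event contrast step of the sensor
log_image = np.zeros((H, W))

# Naive per-event reconstruction: each event signals a log-intensity change of
# +/- C at its pixel, so integrating events maintains the latest image estimate.
for x, y, t, p in events:
    log_image[y, x] += C * p
```

In practice the reconstruction must also handle noise and drift, which is why the thesis emphasises understanding the sensor's noise properties.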
CED: Color Event Camera Dataset
Event cameras are novel, bio-inspired visual sensors, whose pixels output
asynchronous and independent timestamped spikes at local intensity changes,
called 'events'. Event cameras offer advantages over conventional frame-based
cameras in terms of latency, high dynamic range (HDR) and temporal resolution.
Until recently, event cameras have been limited to outputting events in the
intensity channel; however, recent advances have resulted in the development of
color event cameras, such as the Color-DAVIS346. In this work, we present and
release the first Color Event Camera Dataset (CED), containing 50 minutes of
footage with both color frames and events. CED features a wide variety of
indoor and outdoor scenes, which we hope will help drive forward event-based
vision research. We also present an extension of the event camera simulator
ESIM that enables simulation of color events. Finally, we present an evaluation
of three state-of-the-art image reconstruction methods that can be used to
convert the Color-DAVIS346 into a continuous-time, HDR, color video camera to
visualise the event stream, and for use in downstream vision applications.
Comment: Conference on Computer Vision and Pattern Recognition Workshop
An Asynchronous Kalman Filter for Hybrid Event Cameras
Event cameras are ideally suited to capture HDR visual information without
blur but perform poorly on static or slowly changing scenes. Conversely,
conventional image sensors measure absolute intensity of slowly changing scenes
effectively but do poorly on high dynamic range or quickly changing scenes. In
this paper, we present an event-based video reconstruction pipeline for High
Dynamic Range (HDR) scenarios. The proposed algorithm includes a frame
augmentation pre-processing step that deblurs and temporally interpolates frame
data using events. The augmented frame and event data are then fused using a
novel asynchronous Kalman filter under a unifying uncertainty model for both
sensors. Our experimental results are evaluated on both publicly available
datasets with challenging lighting conditions and fast motions and our new
dataset with HDR reference. The proposed algorithm outperforms state-of-the-art
methods in both absolute intensity error (48% reduction) and image similarity
indexes (average 11% improvement).
Comment: 12 pages, 6 figures, published in International Conference on Computer Vision (ICCV) 202
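The fusion idea in the abstract above can be pictured per pixel: events drive the prediction step (each event nudges the log-intensity state by a contrast step and inflates its variance), while each incoming frame pixel is fused as a noisy measurement. This is a toy single-pixel sketch, not the paper's filter; C, Q, and R are invented constants.

```python
# Toy single-pixel sketch of asynchronous event/frame fusion (not the paper's
# exact filter). C is an assumed contrast step; Q and R are assumed noise levels.
C, Q, R = 0.2, 0.01, 0.05

def event_update(x_hat, P, polarity):
    # Prediction: an event shifts the log-intensity state by +/- C and
    # inflates the variance by the process noise Q.
    return x_hat + C * polarity, P + Q

def frame_update(x_hat, P, z):
    # Measurement update with scalar Kalman gain K = P / (P + R).
    K = P / (P + R)
    return x_hat + K * (z - x_hat), (1.0 - K) * P

x_hat, P = 0.0, 1.0                      # initial estimate and variance
x_hat, P = event_update(x_hat, P, +1)    # an event arrives asynchronously
x_hat, P = frame_update(x_hat, P, 0.3)   # a frame pixel arrives later
```

Because both update rules are cheap and local, they can run asynchronously per pixel as events and frames arrive, which is the structural point of the approach.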
An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras
Event cameras are ideally suited to capture High Dynamic Range (HDR) visual
information without blur but provide poor imaging capability for static or
slowly varying scenes. Conversely, conventional image sensors measure absolute
intensity of slowly changing scenes effectively but do poorly on HDR or quickly
changing scenes. In this paper, we present an asynchronous linear filter
architecture, fusing event and frame camera data, for HDR video reconstruction
and spatial convolution that exploits the advantages of both sensor modalities.
The key idea is the introduction of a state that directly encodes the
integrated or convolved image information and that is updated asynchronously as
each event or each frame arrives from the camera. The state can be read off
as often as and whenever required to feed into subsequent vision modules for
real-time robotic systems. Our experimental results are evaluated on both
publicly available datasets with challenging lighting conditions and fast
motions, along with a new dataset with HDR reference that we provide. The
proposed AKF pipeline outperforms other state-of-the-art methods in both
absolute intensity error (69.4% reduction) and image similarity indexes
(average 35.5% improvement). We also demonstrate the integration of image
convolution with linear spatial kernels (Gaussian, Sobel, and Laplacian) as an
application of our architecture.
Comment: 17 pages, 10 figures, accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in August 202
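One way to picture the asynchronous state described above is a per-pixel complementary filter: between arrivals the state decays toward the latest frame value, each event adds a contrast step, and a new frame simply replaces the reference. This is an illustrative sketch under those assumptions, not the paper's filter; alpha and C are invented constants.

```python
import math

alpha, C = 2.0, 0.2     # assumed decay gain (1/s) and contrast step

def decay(state, frame_ref, dt):
    # Between arrivals the state relaxes exponentially toward the frame value.
    return frame_ref + (state - frame_ref) * math.exp(-alpha * dt)

def on_event(state, frame_ref, t_last, t, polarity):
    # Asynchronous event update: decay to time t, then add the contrast step.
    return decay(state, frame_ref, t - t_last) + C * polarity, t

def on_frame(state, frame_ref, t_last, t, frame_value):
    # Asynchronous frame update: decay to time t, then swap in the new reference.
    return decay(state, frame_ref, t - t_last), frame_value, t

state, frame_ref, t_last = 0.0, 0.0, 0.0
state, t_last = on_event(state, frame_ref, t_last, 0.1, +1)
state, frame_ref, t_last = on_frame(state, frame_ref, t_last, 0.2, 0.5)
```

The same decay() call lets the state be read off at any query time between arrivals, which is what allows downstream modules to sample it whenever required.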
Asynchronous Tracking-by-Detection on Adaptive Time Surfaces for Event-based Object Tracking
Event cameras, which are asynchronous bio-inspired vision sensors, have shown
great potential in a variety of situations, such as fast motion and low
illumination scenes. However, most event-based object tracking methods are
designed for scenarios with untextured objects and uncluttered backgrounds,
and few support bounding-box-based object tracking. In this work, we propose an
asynchronous Event-based Tracking-by-Detection (ETD) method for generic
bounding box-based object tracking. To achieve this goal, we present an
Adaptive Time-Surface with Linear Time Decay (ATSLTD) event-to-frame conversion
algorithm, which asynchronously and effectively warps the spatio-temporal
information of asynchronous retinal events to a sequence of ATSLTD frames with
clear object contours. We feed the sequence of ATSLTD frames to the proposed
ETD method to perform accurate and efficient object tracking, which leverages
the high temporal resolution property of event cameras. We compare the proposed
ETD method with seven popular object tracking methods based on conventional
cameras or event cameras, and with two variants of ETD. The
experimental results show the superiority of the proposed ETD method in
handling various challenging environments.
Comment: 9 pages, 5 figures
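The event-to-frame conversion described above can be sketched as a time surface with linear decay: each pixel remembers its latest event timestamp, and rendering maps recency to brightness so fresh contours appear sharp while stale ones fade. This is a generic time-surface sketch inspired by the description, not the ATSLTD algorithm itself; the decay window tau and sizes are invented.

```python
import numpy as np

H, W, tau = 4, 4, 0.05                 # assumed sensor size and decay window (s)
last_ts = np.full((H, W), -np.inf)     # latest event timestamp per pixel

def on_event(x, y, t):
    last_ts[y, x] = t                  # asynchronous per-event update

def render(t_now):
    # Linear time decay: 1.0 at the event instant, fading to 0.0 after tau.
    return np.clip(1.0 - (t_now - last_ts) / tau, 0.0, 1.0)

on_event(1, 2, 0.00)
on_event(3, 0, 0.04)
img = render(0.05)
```

A frame rendered this way highlights the most recent motion, which is what gives the downstream detector clear object contours.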
Asynchronous Spatial Image Convolutions for Event Cameras
Spatial convolution is arguably the most fundamental of two-dimensional image processing operations. Conventional spatial image convolution can only be applied to a conventional image, that is, an array of pixel values (or similar image representation) that are associated with a single instant in time. Event cameras have serial, asynchronous output with no natural notion of an image frame, and each event arrives with a different timestamp. In this letter, we propose a method to compute the convolution of a linear spatial kernel with the output of an event camera. The approach operates on the event stream output of the camera directly without synthesising pseudoimage frames as is common in the literature. The key idea is the introduction of an internal state that directly encodes the convolved image information, which is updated asynchronously as each event arrives from the camera. The state can be read off as often as and whenever required for
use in higher level vision algorithms for real-time robotic systems. We demonstrate the application of our method to corner detection, providing an implementation of a Harris corner-response 'state' that can be used in real time for feature detection and tracking on robotic systems.
This work was supported in part by the Australian Government Research Training Program Scholarship and in part by the Australian Research Council through the 'Australian Centre of Excellence for Robotic Vision' under Grant CE140100016.
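The per-event state update described above can be sketched as follows: when an event fires, the convolved-image state receives a copy of the kernel, scaled by the event's contrast step, centred on the event's pixel. This is an illustrative sketch under assumed constants; boundary handling is simplified and the kernel-flipping convention of true convolution is omitted.

```python
import numpy as np

H, W, C = 5, 5, 0.2                    # assumed image size and contrast step
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
state = np.zeros((H, W))               # internal convolved-image state

def on_event(x, y, polarity, kernel=sobel_x):
    # An event contributes C * polarity at (x, y); its effect on the convolved
    # image is that delta spread through the kernel around the event's pixel.
    k = kernel.shape[0] // 2
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < H and 0 <= xx < W:
                state[yy, xx] += C * polarity * kernel[k + dy, k + dx]

on_event(2, 2, +1)    # the state can be read off at any time afterwards
```

Because each event touches only a kernel-sized neighbourhood, the update cost is constant per event, which is what makes the approach viable without synthesising pseudoimage frames.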